Efficient multi-step query processing for EMD-based similarity
Authors
Abstract
Similarity search in large multimedia databases requires efficient query processing based on suitable similarity models. A similarity model consists of a feature extraction step and a distance defined on the extracted features, and it demands an efficient algorithm for retrieving similar objects under the model. In this work, we focus on the Earth Mover's Distance (EMD), a recently introduced similarity model that has been employed successfully in numerous applications and is reported to reflect human perceptual similarity well. Because its computation is complex, applying the EMD directly to large, high-dimensional databases is not feasible. To remedy this and let users benefit from the high quality of the model even in larger settings, we develop various lower bounds for the EMD to be used in index-supported multistep query processing algorithms. We prove that our algorithms are complete, that is, they produce no false drops. We also show that they are highly efficient, as experiments on large image databases with high-dimensional features demonstrate.
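The filter-and-refine idea behind multistep query processing can be illustrated with a minimal sketch. It is not the paper's algorithm or any of its lower bounds: it assumes 1-D histograms of equal total mass with ground distance |i - j|, uses a simple centroid-based lower bound, and scans the database linearly instead of using an index. The function names `emd_1d`, `centroid_lower_bound`, and `range_query` are hypothetical.

```python
from itertools import accumulate


def emd_1d(x, y):
    """Exact EMD for 1-D histograms of equal total mass, ground distance |i - j|.

    In this special case the EMD equals the sum of absolute differences
    of the cumulative histograms.
    """
    diffs = [a - b for a, b in zip(x, y)]
    return sum(abs(c) for c in accumulate(diffs))


def centroid_lower_bound(x, y):
    """Cheap lower bound (illustrative assumption): distance between mass-weighted centroids."""
    cx = sum(i * a for i, a in enumerate(x))
    cy = sum(i * b for i, b in enumerate(y))
    return abs(cx - cy)


def range_query(database, query, epsilon):
    """Return ids of all objects with EMD <= epsilon, plus the number of exact EMD calls.

    Filter step: discard an object if its lower bound already exceeds epsilon.
    Refine step: compute the exact EMD only for the surviving candidates.
    Because the filter never overestimates the EMD, no true result is discarded.
    """
    results = []
    exact_computations = 0
    for obj_id, hist in database.items():
        if centroid_lower_bound(query, hist) > epsilon:
            continue  # safe to prune: the exact EMD is at least as large
        exact_computations += 1
        if emd_1d(query, hist) <= epsilon:
            results.append(obj_id)
    return results, exact_computations


if __name__ == "__main__":
    database = {
        "a": [4, 0, 0, 0],
        "b": [0, 0, 0, 4],
        "c": [2, 2, 0, 0],
        "d": [1, 1, 1, 1],
    }
    print(range_query(database, [4, 0, 0, 0], epsilon=2))
```

Completeness here rests only on the lower-bounding property: any filter distance that never exceeds the exact EMD could be substituted without losing results, and a tighter bound simply leaves fewer candidates for the expensive refine step.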
Similar resources
Efficient User-Adaptable Similarity Search in Large Multimedia Databases
Efficient user-adaptable similarity search is becoming increasingly important for multimedia and spatial database systems. As a general similarity model for multi-dimensional vectors that is adaptable to application requirements and user preferences, we use quadratic form distance functions which have been successfully applied to color histograms in image databases [Fal+ 94]. The compone...
Indexing Earth Mover's Distance over Network Metrics
The Earth Mover’s Distance (EMD) is a well-known distance metric for data represented as probability distributions over a predefined feature space. Supporting EMD-based similarity search has attracted intensive research effort. Despite the plethora of literature, most existing solutions are optimized for Lp feature spaces (e.g., Euclidean space); while in a spectrum of applications, the relatio...
Efficient and effective similarity search on complex objects
Due to the rapid development of computer technology and new methods for the extraction of data in the last few years, more and more applications of databases have emerged, for which an efficient and effective similarity search is of great importance. Application areas of similarity search include multimedia, computer aided engineering, marketing, image processing and many more. Special interest...
An Efficient Query Algorithm for Trajectory Similarity Based on Fréchet Distance Threshold
The processing and analysis of trajectories are the core of many location-based applications and services, while trajectory similarity is an essential concept regularly used. To address the time-consuming problem of similarity query, an efficient algorithm based on Fréchet distance called Ordered Coverage Judge (OCJ) is proposed, which could realize the filtering query with a given Fréchet dist...
Mirex2008: Query by Humming/singing System
This extended abstract describes my submission to the QBSH (Query by Singing/Humming) task of MIREX (Music Information Retrieval Evaluation eXchange) 2008. The system takes advantage of note-based and frame-based matching methods to improve the accuracy of the Query by Singing/Humming system. First, Earth Mover’s Distance (EMD), which is note-based and much faster, is adopted to eliminate most ...